AITopics | generalized entropy

Collaborating Authors

generalized entropy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Coupled Entropy: A Goldilocks Generalization for Nonextensive Statistical Mechanics

Nelson, Kenric P.

arXiv.org Machine LearningJul-15-2025

Evidence is presented that the accuracy of Nonextensive Statistical Mechanics framework is improved using the coupled entropy, which carefully establishes the physical measures of complex systems. While Nonextensive Statistical Mechanics (NSM) has developed into a powerful toolset, questions have persisted as to how to evaluate whether its proposed solutions properly characterize the uncertainty of heavy-tailed distributions. The entropy of the generalized Pareto distribution (GPD) is $1+κ+\lnσ$, where $κ$ is the shape or nonlinear coupling and $σ$ is the scale. A generalized entropy should retain the uncertainty due to the scale, while minimizing the dependence of the nonlinear coupling. The Tsallis entropy of the GPD instead subtracts a function of the inverse-scale and converges to one as $κ\rightarrow\infty$. Colloquially, the Tsallis entropy is too cold. The normalized Tsallis entropy (NTE) rectifies the positive dependence on the scale but introduces a nonlinear term multiplying the scale and the coupling, making it too hot. The coupled entropy measures the uncertainty of the GPD to be $1+\ln_\fracκ{1+κ}σ=1+\frac{1+κ}κ(σ^\fracκ{1+κ}-1)$, which converges to $σ$ as $κ\rightarrow\infty$. One could say, the coupled entropy allows scientists, engineers, and analysts to eat their porridge, confident that its measure of uncertainty reflects the mathematical physics of the scale of non-exponential distributions while minimizing the dependence on the shape or nonlinear coupling. The training of the coupled variational autoencoder is an example of the unique ability of the coupled entropy to improve the performance of complex systems.

artificial intelligence, entropy, machine learning, (17 more...)

arXiv.org Machine Learning

2506.17229

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.89)

Add feedback

On Calibration in Multi-Distribution Learning

Verma, Rajeev, Fischer, Volker, Nalisnick, Eric

arXiv.org Artificial IntelligenceDec-18-2024

Modern challenges of robustness, fairness, and decision-making in machine learning have led to the formulation of multi-distribution learning (MDL) frameworks in which a predictor is optimized across multiple distributions. We study the calibration properties of MDL to better understand how the predictor performs uniformly across the multiple distributions. Through classical results on decomposing proper scoring losses, we first derive the Bayes optimal rule for MDL, demonstrating that it maximizes the generalized entropy of the associated loss function. Our analysis reveals that while this approach ensures minimal worst-case loss, it can lead to non-uniform calibration errors across the multiple distributions and there is an inherent calibration-refinement trade-off, even at Bayes optimality. Our results highlight a critical limitation: despite the promise of MDL, one must use caution when designing predictors tailored to multiple distributions so as to minimize disparity.

artificial intelligence, machine learning, predictor, (18 more...)

arXiv.org Artificial Intelligence

2412.14142

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > New York (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Robotic Exploration using Generalized Behavioral Entropy

Suresh, Aamodh, Nieto-Granda, Carlos, Martinez, Sonia

arXiv.org Artificial IntelligenceFeb-15-2024

This work presents and evaluates a novel strategy for robotic exploration that leverages human models of uncertainty perception. To do this, we introduce a measure of uncertainty that we term ``Behavioral entropy'', which builds on Prelec's probability weighting from Behavioral Economics. We show that the new operator is an admissible generalized entropy, analyze its theoretical properties and compare it with other common formulations such as Shannon's and Renyi's. In particular, we discuss how the new formulation is more expressive in the sense of measures of sensitivity and perceptiveness to uncertainty introduced here. Then we use Behavioral entropy to define a new type of utility function that can guide a frontier-based environment exploration process. The approach's benefits are illustrated and compared in a Proof-of-Concept and ROS-unity simulation environment with a Clearpath Warthog robot. We show that the robot equipped with Behavioral entropy explores faster than Shannon and Renyi entropies.

behavioral entropy, entropy, exploration, (15 more...)

arXiv.org Artificial Intelligence

2402.10161

Country:

North America > United States > Maryland > Prince George's County > Adelphi (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.46)

Add feedback

A Fair Empirical Risk Minimization with Generalized Entropy

Jin, Youngmi, Gim, Jio, Lee, Tae-Jin, Suh, Young-Joo

arXiv.org Artificial IntelligenceJan-31-2023

This paper studies a parametric family of algorithmic fairness metrics, called generalized entropy, which originally has been used in public welfare and recently introduced to machine learning community. As a meaningful metric to evaluate algorithmic fairness, it requires that generalized entropy specify fairness requirements of a classification problem and the fairness requirements should be realized with small deviation by an algorithm. We investigate the role of generalized entropy as a design parameter for fair classification algorithm through a fair empirical risk minimization with a constraint specified in terms of generalized entropy. We theoretically and experimentally study learnability of the problem.

artificial intelligence, generalized entropy, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2202.11966

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands (0.04)
Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Education (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Nonlinear Collaborative Scheme for Deep Neural Networks

Zhen, Hui-Ling, Lin, Xi, Tang, Alan Z., Li, Zhenhua, Zhang, Qingfu, Kwong, Sam

arXiv.org Machine LearningNov-3-2018

Conventional research attributes the improvements of generalization ability of deep neural networks either to powerful optimizers or the new network design. Different from them, in this paper, we aim to link the generalization ability of a deep network to optimizing a new objective function. To this end, we propose a \textit{nonlinear collaborative scheme} for deep network training, with the key technique as combining different loss functions in a nonlinear manner. We find that after adaptively tuning the weights of different loss functions, the proposed objective function can efficiently guide the optimization process. What is more, we demonstrate that, from the mathematical perspective, the nonlinear collaborative scheme can lead to (i) smaller KL divergence with respect to optimal solutions; (ii) data-driven stochastic gradient descent; (iii) tighter PAC-Bayes bound. We also prove that its advantage can be strengthened by nonlinearity increasing. To some extent, we bridge the gap between learning (i.e., minimizing the new objective function) and generalization (i.e., minimizing a PAC-Bayes bound) in the new scheme. We also interpret our findings through the experiments on Residual Networks and DenseNet, showing that our new scheme performs superior to single-loss and multi-loss schemes no matter with randomization or not.

artificial intelligence, machine learning, nonlinear collaborative scheme, (16 more...)

arXiv.org Machine Learning

1811.01316

Country: Asia (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)

Add feedback

Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms

Blondel, Mathieu, Martins, André F. T., Niculae, Vlad

arXiv.org Machine LearningMay-24-2018

We study in this paper Fenchel-Young losses, a generic way to construct convex loss functions from a convex regularizer. We provide an in-depth study of their properties in a broad setting and show that they unify many well-known loss functions. When constructed from a generalized entropy, which includes well-known entropies such as Shannon and Tsallis entropies, we show that Fenchel-Young losses induce a predictive probability distribution and develop an efficient algorithm to compute that distribution for separable entropies. We derive conditions for generalized entropies to yield a distribution with sparse support and losses with a separation margin. Finally, we present both primal and dual algorithms to learn predictive models with generic Fenchel-Young losses.

artificial intelligence, fenchel-young loss, machine learning, (16 more...)

arXiv.org Machine Learning

1805.09717

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Products of Weighted Logic Programs

Cohen, Shay B., Simmons, Robert J., Smith, Noah A.

arXiv.org Artificial IntelligenceJun-15-2010

Weighted logic programming, a generalization of bottom-up logic programming, is a well-suited framework for specifying dynamic programming algorithms. In this setting, proofs correspond to the algorithm's output space, such as a path through a graph or a grammatical derivation, and are given a real-valued score (often interpreted as a probability) that depends on the real weights of the base axioms used in the proof. The desired output is a function over all possible proofs, such as a sum of scores or an optimal score. We describe the PRODUCT transformation, which can merge two weighted logic programs into a new one. The resulting program optimizes a product of proof scores from the original programs, constituting a scoring function known in machine learning as a ``product of experts.'' Through the addition of intuitive constraining side conditions, we show that several important dynamic programming algorithms can be derived by applying PRODUCT to weighted logic programs corresponding to simpler weighted logic programs. In addition, we show how the computation of Kullback-Leibler divergence, an information-theoretic measure, can be interpreted using PRODUCT.

artificial intelligence, logic & formal reasoning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1017/S1471068410000529

1006.3035

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback